Online Social Network Friends and Spatio-temporal Proximity of Their Geotagged Photos – A Case Study of Flickr Data
This empirical study analyzes the relationship between online social network (OSN) friendships and the spatio-temporal proximity of users' geotagged photos, using Flickr data as a case study. First, it analyzes whether Flickr friends tend to post geotagged photos that are closer to each other in space and time than those of Flickr non-friends. Second, it investigates whether the number of geotagged photos posted by users is related to the distance and time difference between their geotagged photos. Third, it examines the spatial distributions of geotagged photos of Flickr friends within specific distance intervals to further understand the geographic meaning of Flickr users' geotagging activities. The findings can improve our understanding of the relationship between users' virtual friendships and their physical activities, and can support future research on location-based services, location-based OSN searches, and location-based online marketing.
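As an illustration of what "distance and time difference between geotagged photos" can mean in practice, the following is a minimal sketch, not the study's actual method. It assumes each photo is a hypothetical `(lat, lon, taken_at)` tuple, uses the haversine great-circle distance, and compares every photo of one user with every photo of another.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def hours_apart(t1, t2):
    """Absolute time difference between two datetimes, in hours."""
    return abs((t1 - t2).total_seconds()) / 3600.0


def pairwise_proximity(photos_a, photos_b):
    """Return (distance_km, hours) for every cross-user photo pair.

    Each photo is a hypothetical (lat, lon, taken_at) tuple.
    """
    return [
        (haversine_km(la, lo, lb, lob), hours_apart(ta, tb))
        for (la, lo, ta) in photos_a
        for (lb, lob, tb) in photos_b
    ]


if __name__ == "__main__":
    user_a = [(0.0, 0.0, datetime(2015, 6, 1, 12, 0))]
    user_b = [(0.0, 1.0, datetime(2015, 6, 1, 15, 0))]
    for km, hours in pairwise_proximity(user_a, user_b):
        print(f"{km:.1f} km apart, {hours:.1f} h apart")
```

Comparing the distribution of these pairs for friend and non-friend user pairs is one way such a proximity question could be framed; the summary statistic to use (minimum, median, binned counts) is left open here.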
POIReviewQA: A Semantically Enriched POI Retrieval and Question Answering Dataset
Many services that perform information retrieval for Points of Interest (POI)
utilize a Lucene-based setup with spatial filtering. While this type of system
is easy to implement, it does not make use of semantics but relies on direct
word matches between a query and reviews, leading to a loss in both precision
and recall. To study the challenging task of semantically enriching POIs from
unstructured data in order to support open-domain search and question answering
(QA), we introduce a new dataset POIReviewQA. It consists of 20k questions
(e.g., "is this restaurant dog friendly?") for 1022 Yelp business types. For each
question, we sampled 10 reviews and annotated each sentence in the reviews as to
whether it answers the question and what the corresponding answer is. To test a
system's ability to understand the text, we adopt an information retrieval
evaluation by ranking all the review sentences for a question based on the
likelihood that they answer this question. We build a Lucene-based baseline
model, which achieves 77.0% AUC and 48.8% MAP. A sentence embedding-based model
achieves 79.2% AUC and 41.8% MAP, indicating that the dataset presents a
challenging problem for future research by the GIR community. The resulting
technology can help exploit the thematic content of web documents and social
media for characterisation of locations.
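To make the ranking-based evaluation concrete, the sketch below computes AUC and mean average precision (MAP) from per-sentence scores and binary answer labels. It is an illustration under stated assumptions, not the authors' evaluation code: the function names are hypothetical, AUC is computed per question by pairwise comparison with ties counted as half, and questions with no answering sentence are skipped when averaging.

```python
def average_precision(scores, labels):
    """Average precision for one question: mean of precision@k at each relevant rank."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def roc_auc(scores, labels):
    """Probability that a random answering sentence outscores a random non-answering one."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_average_precision(queries):
    """MAP over (scores, labels) pairs; questions without any answer are skipped (assumption)."""
    aps = [average_precision(s, l) for s, l in queries if any(l)]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    # Toy data: four review sentences for one question, two of which answer it.
    scores = [0.9, 0.8, 0.3, 0.1]
    labels = [1, 0, 1, 0]
    print("AP :", average_precision(scores, labels))
    print("AUC:", roc_auc(scores, labels))
```

The two metrics can disagree, as in the reported results: AUC weighs all answer/non-answer pairs equally, while MAP rewards placing answering sentences at the very top of the ranking.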